Missing Data and Imputation

Authors

Javier Estrada

Michael Underwood

Elizabeth Subject-Scott

Published

April 10, 2023

Introduction

Missing Data

Missing data occurs when values for one or more variables are not recorded for some observations in a dataset. There are many reasons why this happens, and it can be intentional or unintentional. Missingness is classified into the following three categories, known as missingness mechanisms (Mainzer et al. 2023):

  • Missing completely at random (MCAR): the probability that a value is missing is independent of both the observed and the unobserved data.

  • Missing at random (MAR): the probability that a value is missing depends only on the observed values.

  • Missing not at random (MNAR): the probability that a value is missing depends on the missing values themselves, even after the observed values are taken into account.

Figure 1: Graphical Representation of Missingness Mechanisms (Schafer and Graham 2002)

(X are the completely observed variables. Y are the partly missing variables. Z is the component of the cause of missingness unrelated to X and Y. R is the missingness.)

Looking for patterns in the missing data can help us determine the category to which it belongs. These mechanisms are important in determining how to handle the missing data. MCAR would be the best-case scenario but seldom occurs. MAR and MNAR are more common.
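To make the three mechanisms concrete, the following is a minimal simulation sketch in base R. The variable names, the sample size, and the logistic missingness probabilities are illustrative assumptions and do not come from the credit data analyzed later.

```r
set.seed(42)                          # arbitrary seed, for reproducibility only
n <- 10000
x <- rnorm(n)                         # fully observed variable (X)
y <- 2 + x + rnorm(n)                 # variable that will be partly missing (Y)

# Probability that y is missing under each mechanism (assumed forms)
p_mcar <- rep(0.3, n)                 # MCAR: constant, unrelated to x or y
p_mar  <- plogis(-1 + 1.5 * x)        # MAR: depends only on the observed x
p_mnar <- plogis(-1 + 1.5 * (y - 2))  # MNAR: depends on the value of y itself

# Set each value to NA with its own probability p
make_missing <- function(values, p) ifelse(runif(length(values)) < p, NA, values)

y_mcar <- make_missing(y, p_mcar)
y_mar  <- make_missing(y, p_mar)
y_mnar <- make_missing(y, p_mnar)

# Mean of the observed part of y under each mechanism, next to the full-data mean
observed_means <- c(full = mean(y),
                    MCAR = mean(y_mcar, na.rm = TRUE),
                    MAR  = mean(y_mar,  na.rm = TRUE),
                    MNAR = mean(y_mnar, na.rm = TRUE))
round(observed_means, 2)
```

In this sketch the observed mean stays close to the full-data mean only under MCAR. Under MAR the distortion can in principle be corrected using x, whereas under MNAR it cannot be corrected from the observed data alone.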

Ignoring missing values means that the analyzed data no longer give a true representation of the dataset, which can bias the analysis and reduce its statistical power (van Ginkel et al. 2020). To enhance the quality of the research, researchers should explicitly acknowledge missing data problems and the conditions under which they occur, and employ principled methods to handle the missing data (Dong and Peng 2013).

Methods to Deal with Missing Data

There are three types of methods to deal with missing data: the likelihood and Bayesian method, weighting methods, and imputation methods (Cao et al. 2021). Missing data can also be handled by simply deleting the affected observations.

  • The likelihood and Bayesian method combines information from a prior predictive distribution with evidence obtained in a sample to predict a value. It requires technical coding and advanced statistical knowledge.

  • The weighting method is a traditional approach in which weights from available data are used to adjust for non-response in a survey. Inefficiency occurs when there are extreme weights or a need for many weights.

  • The imputation method replaces each missing value with an estimate derived from the original dataset. There are two types of imputation: single and multiple.

Deleting missing data

Listwise deletion is when the entire observation is removed from the dataset. Deleting missing data can lead to the loss of important information regarding your dataset and is therefore not recommended. In certain cases, when the amount of missing data is small and the type is MCAR, listwise deletion can be used. There usually won’t be bias but potentially important information may be lost.

T-tests and chi-square tests can be used to assess whether missingness is related to other variables: the observations are split into two groups, those with and those without a missing value on a given variable, and the groups are compared on the other variables to determine whether their means (or proportions) differ significantly. According to van Ginkel et al. (2020), a significant result leads to rejection of the null hypothesis, indicating that the missing values are not randomly scattered throughout the data. This implies that the missing data is MAR or MNAR. Conversely, a nonsignificant result is consistent with MCAR and gives no evidence of MAR. It does not eliminate the possibility that the data is MNAR; other information about the population is needed to determine this.
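The check described above can be sketched in base R as follows. The simulated variables (age, income, job) and the way missingness is generated are assumptions made for illustration only.

```r
set.seed(7)                                    # arbitrary seed
n      <- 2000
age    <- round(rnorm(n, mean = 40, sd = 10))
job    <- sample(c("fixed", "freelance"), n, replace = TRUE)
income <- 50 + 2 * age + rnorm(n, sd = 20)

# Assumption: income is more likely to be missing for older subjects
income[runif(n) < plogis((age - 45) / 5)] <- NA

income_missing <- is.na(income)                # missingness indicator (TRUE/FALSE)

# t-test: does mean age differ between rows with and without missing income?
test_age <- t.test(age ~ income_missing)
test_age$p.value

# Chi-square test: is missingness of income associated with job type?
test_job <- chisq.test(table(job, income_missing))
test_job$p.value
```

A small p-value for age suggests that missingness in income is related to age, so the data would not be treated as MCAR. The test cannot by itself separate MAR from MNAR.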

Whenever missing data is categorized as MAR or MNAR, listwise deletion would be wasteful, and the analysis biased. Alternate methods of dealing with the missing data are recommended: either pairwise deletion or imputation.

Pairwise deletion excludes an observation only from those analyses that involve a variable for which it has a missing value. It allows more data to be analyzed than listwise deletion but limits the ability to make inferences about the total sample. For this reason, it is recommended to use imputation to properly deal with missing data.

Preferred Method to Handle Missing Data

Imputation is the preferred method to handle missing data. It consists of replacing missing data with an estimate obtained from the original, available data. After imputation, there will be a full dataset to analyze. To improve statistical power, the number of imputations created should be at least equal to the percent of missing data (5% equals 5 imputations, 10% equals 10 imputations, 20% equals 20 imputations, etc.) (Pedersen et al. 2017). According to Wulff and Jeppesen (2017), 3-5 imputations are sufficient, and 10 are more than enough.

Single, or univariate, imputation is when only one estimate is used to replace the missing data. Methods of single imputation include using the mean, the last observation carried forward, and random imputation. The following is a brief explanation of each:

  • Using the mean to replace a missing value is a straightforward process. The mean of the observed values of a variable is calculated and is then used as the estimate for every missing value of that variable. The problem with this method is that it reduces the variance, which leads to artificially narrow confidence intervals.

  • Last Observation Carried Forward (LOCF) is a technique of replacing a missing value in longitudinal studies with a previously observed value (the most recent value is carried forward) (Streiner 2008). The problem with this method is that it assumes that the previously observed value remains unchanged when in reality that most likely is not the case.

  • Random imputation is a method of randomly drawing an observation and using that observation for any of the missing values. The problem with this method is that it introduces additional variability.

These single imputation methods are flawed. They often result in underestimation of standard errors or too small p-values (Dong and Peng 2013), which can cause bias in the analysis. Therefore, multiple imputation is the better method because it handles missing data better and provides less biased results.
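The three single imputation methods listed above can be sketched in a few lines of base R. The vector y and the positions of its missing values are made up for illustration, and locf() is a hypothetical helper written for this sketch, not a function from a package.

```r
set.seed(123)                                # arbitrary seed
y <- round(rnorm(20, mean = 50, sd = 10))    # illustrative measurements
y[c(3, 8, 9, 15)] <- NA                      # introduce missing values

# Mean imputation: every missing value receives the mean of the observed values
y_mean <- ifelse(is.na(y), mean(y, na.rm = TRUE), y)

# LOCF: carry the most recent observed value forward (hypothetical helper)
locf <- function(v) {
  for (i in seq_along(v)[-1]) {
    if (is.na(v[i])) v[i] <- v[i - 1]
  }
  v
}
y_locf <- locf(y)

# Random imputation: draw replacements at random from the observed values
y_rand <- y
y_rand[is.na(y)] <- sample(y[!is.na(y)], sum(is.na(y)), replace = TRUE)

# Mean imputation shrinks the variance relative to the observed data
c(observed = var(y, na.rm = TRUE), mean_imputed = var(y_mean))
```

The last line illustrates the variance reduction mentioned for mean imputation: the imputed values add no spread, yet they increase the number of observations in the denominator.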

Multiple, or multivariate, imputation is when various estimates are used to replace the missing data by creating multiple datasets from versions of the original dataset. It can be done by using a regression model, or a sequence of regression models, such as linear, logistic and Poisson. A set of M plausible values is generated for each unobserved data point, resulting in M complete data sets (Dong and Peng 2013). The new values are randomly drawn from predictive distributions either through joint modeling (JM, which is not used much anymore) or fully conditional specification (FCS) (Wongkamthong and Akande 2023). Each completed dataset is then analyzed, and the results are combined into a single set of estimates.

The purpose of multiple imputation is to create a pool of imputed data for analysis, but if the pooled results are lacking, then multiple imputation should not be done (Mainzer et al. 2023). Another reason not to use multiple imputation is if there are very few missing values; there may be no benefit in using it. Also worth noting is that some statistical analysis software already has built-in features to deal with missing data.

Multiple imputation by chained equations, otherwise known as MICE, is the most common and preferred method of multiple imputation (Wulff and Jeppesen 2017). It provides a more reliable way to analyze data with missing values. For this reason, this paper will focus on the methodology and application of the MICE process.

Code
#loading packages
library(DiagrammeR)

Figure 2: Flowchart of the MICE-process based on procedures proposed by Rubin (Wulff and Jeppesen 2017)

Code
DiagrammeR::grViz("digraph {

# initiate graph
graph [layout = dot, rankdir = LR, label = 'The MICE-Process\n\n',labelloc = t, fontcolor = DarkSlateBlue, fontsize = 45]

# global node settings
node [shape = rectangle, style = filled, fillcolor = AliceBlue, fontcolor = DarkSlateBlue, fontsize = 35]
bgcolor = none

# label nodes
incomplete [label =  'Incomplete data set']
imputed1 [label = 'Imputed \n data set 1']
estimates1 [label = 'Estimates from \n analysis 1']
rubin [label = 'Rubin rules', shape = diamond]
combined [label = 'Combined results']
imputed2 [label = 'Imputed \n data set 2']
estimates2 [label = 'Estimates from \n analysis 2']
imputedm [label = 'Imputed \n data set m']
estimatesm [label = 'Estimates from \n analysis m']


# edge definitions with the node IDs
incomplete -> imputed1 [arrowhead = vee, color = DarkSlateBlue]
imputed1 -> estimates1 [arrowhead = vee, color = DarkSlateBlue]
estimates1 -> rubin [arrowhead = vee, color = DarkSlateBlue]
incomplete -> imputed2 [arrowhead = vee, color = DarkSlateBlue]
imputed2 -> estimates2 [arrowhead = vee, color = DarkSlateBlue]
estimates2-> rubin [arrowhead = vee, color = DarkSlateBlue]
incomplete -> imputedm [arrowhead = vee, color = DarkSlateBlue]
imputedm -> estimatesm [arrowhead = vee, color = DarkSlateBlue]
estimatesm -> rubin [arrowhead = vee, color = DarkSlateBlue]
rubin -> combined [arrowhead = vee, color = DarkSlateBlue]
}")

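*Rubin’s Rules: Average the m estimates to obtain the pooled estimate. Calculate the average of the m within-imputation variances and the variance of the estimates between imputations. Combine the two, multiplying the between-imputation variance by the adjustment term (1+1/m), to obtain the total variance.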

Other Methods of Imputation

Other methods of imputation are worth noting and are briefly described below.

Regression Imputation is based on a linear regression model. Missing data is randomly drawn from a conditional distribution when variables are continuous and from a logistic regression model when they are categorical (van Ginkel et al. 2020).

Predictive Mean Matching is also based on a linear regression model. The approach is the same as regression imputation when it comes to categorical missing values but different for continuous variables. Instead of random draws from a conditional distribution, each missing value is filled in with an observed value from a case whose predicted value is close to the predicted value for the missing case (van Ginkel et al. 2020).

Hot Deck (HD) imputation is when a missing value is replaced by an observed response of a similar unit, also known as the donor. It can be either random or deterministic (based on a metric or value) (Thongsri and Samart 2022). It does not rely on model fitting.

Stochastic Regression (SR) Imputation is an extension of regression imputation. The process is the same, but a residual term drawn from a normal distribution is added to each value predicted by the regression (Thongsri and Samart 2022). This maintains the variability of the data.

Random Forest (RF) Imputation is based on machine learning algorithms. Missing values are first replaced with the mean or mode of that particular variable and then the dataset is split into a training set and a prediction set (Thongsri and Samart 2022). The missing values are then replaced with predictions from these sets. This type of imputation can be used on continuous or categorical variables with complex interactions.

Methodology

Multiple Imputation by Chained Equations (MICE)

In multiple imputation, M imputed values are created for each missing data point, resulting in M complete datasets. For each of the M datasets, an estimate \({\hat{\theta}}_{m}\) of \(\theta\) is acquired.

The combined estimator of \(\theta\) is given by:

\(\displaystyle {\hat{\theta}}_{M} = \frac{1}{M} \sum_{m = 1}^{M} {\hat{\theta}}_{m}\)

The proposed variance estimator of \({\hat{\theta}}_{M}\) is given by:

\(\displaystyle {\hat{\Phi}}_{M} = {\overline{\phi}}_{M} + \left(1 + \frac{1}{M}\right) B_{M}\)

where \(\displaystyle {\overline{\phi}}_{M} = \frac{1}{M} \sum_{m = 1}^{M} {\hat{\phi}}_{m}\) is the average of the variance estimates \({\hat{\phi}}_{m}\) obtained from the M datasets (the within-imputation variance)

and \(\displaystyle B_{M} = \frac{1}{M-1} \sum_{m = 1}^{M} \left({\hat{\theta}}_{m} - {\hat{\theta}}_{M}\right)^{2}\) is the variance of the estimates between the M datasets (the between-imputation variance)

(Arnab 2017)
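As a worked illustration of the combining formulas, the sketch below pools five made-up estimates in base R. The function name pool_rubin() and the input numbers are hypothetical and chosen only so that the arithmetic is easy to follow.

```r
# Combine M estimates and their variances using the formulas above.
# theta_hat: the M point estimates; phi_hat: their M estimated variances.
pool_rubin <- function(theta_hat, phi_hat) {
  M       <- length(theta_hat)
  theta_M <- mean(theta_hat)                          # combined estimate
  phi_bar <- mean(phi_hat)                            # within-imputation variance
  B_M     <- sum((theta_hat - theta_M)^2) / (M - 1)   # between-imputation variance
  Phi_M   <- phi_bar + (1 + 1 / M) * B_M              # total variance
  c(estimate = theta_M, within = phi_bar, between = B_M,
    total = Phi_M, se = sqrt(Phi_M))
}

theta_hat <- c(10, 12, 11, 13, 14)   # hypothetical estimates from M = 5 datasets
phi_hat   <- c(4, 4, 4, 4, 4)        # hypothetical variances of those estimates

pool_rubin(theta_hat, phi_hat)
```

For these inputs the combined estimate is 12, the within-imputation variance is 4, the between-imputation variance is 10/4 = 2.5, and the total variance is 4 + (1 + 1/5) × 2.5 = 7.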

The chained equation process has the following steps (Azur et al. 2011):

Step 1:

Using a simple imputation method, such as the mean, replace every missing value with a temporary value, referred to as the “place holder”.

Step 2:

The “place holder” values for one variable are set back to missing.

Step 3:

The observed values from this variable (dependent variable) are regressed on the other variables (independent variables) in the model, using the same assumptions as when performing linear, logistic, or Poisson regression.

Step 4:

The missing values for this variable are replaced with predictions (imputations) from this newly created model.

Step 5:

Repeat Steps 2-4 for each variable that has missing values until all missing values have been replaced. This constitutes one cycle.

Step 6:

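Repeat Steps 2-4 for a number of cycles, updating the imputations each cycle; the imputations from the final cycle form one imputed dataset. The entire process is repeated to create as many “m” imputed datasets as are required.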
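The steps above can be sketched in base R for a small simulated dataset with two incomplete numeric variables. This is a simplified illustration rather than the algorithm implemented in the mice package: it uses only linear regression, adds normally distributed residual noise to the predictions, and does not draw the regression parameters from their posterior distribution. All variable names and simulation settings are assumptions.

```r
set.seed(1)                                   # arbitrary seed
n  <- 200
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
x3 <- x1 - x2 + rnorm(n)
dat <- data.frame(x1, x2, x3)
dat$x2[sample(n, 30)] <- NA                   # make x2 and x3 partly missing
dat$x3[sample(n, 20)] <- NA

miss <- is.na(dat)                            # remember which cells were missing
vars <- names(dat)[colSums(miss) > 0]         # variables that need imputation

# Step 1: fill every missing cell with a place holder (the variable mean)
imp <- dat
for (v in vars) imp[miss[, v], v] <- mean(dat[[v]], na.rm = TRUE)

n_cycles <- 10                                # assumed number of cycles
for (cycle in seq_len(n_cycles)) {
  for (v in vars) {                           # Step 5: loop over incomplete variables
    obs <- !miss[, v]                         # Step 2: ignore place holders of v
    # Step 3: regress observed values of v on all other variables
    fit <- lm(reformulate(setdiff(names(imp), v), response = v),
              data = imp[obs, ])
    # Step 4: replace missing values of v with predictions plus residual noise
    pred <- predict(fit, newdata = imp[!obs, ])
    imp[!obs, v] <- pred + rnorm(sum(!obs), mean = 0, sd = summary(fit)$sigma)
  }
}

head(imp)
```

Running the whole block once yields one completed dataset. Repeating it with different random draws would yield the multiple datasets that are later analyzed and pooled.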

Analysis and Results

Data and Visualizations

Load Data and Packages
Code
# load data
credit = read.csv("credit_data.csv")

# load libraries
library(gtsummary)
library(dplyr, warn.conflicts=FALSE)
library(mice, warn.conflicts=FALSE)
Description of Dataset

Credit score data

Details of Dataset

The credit.csv file is from the website of Dr. Lluís A. Belanche Muñoz, by way of a github repository of Dr. Gaston Sanchez. It contains data on 4,454 subjects and stores a combination of continuous, categorical and count values for 15 variables. Of the 15 variables, the “Status” variable contains binary categorical values of “good” and “bad” to describe the kind of credit score each subject has. One data point was missing an outcome and was removed from the original data.

Definition of Data in Dataset
Variable | Type | Description
X | Integer | Count variable indicating the number of subjects.
Status | Character | 2-level categorical variable indicating the status of the subject’s credit: good or bad.
Seniority | Integer | Count variable indicating the seniority a subject has accumulated over the course of their life.
Home | Character | 6-level categorical variable indicating the subject’s relationship to their residential address: rent, owner, parents, priv, other, or ignore.
Time | Integer | Count variable showing how many months have elapsed since the subject’s payment deadline without paying their debt in full.
Age | Integer | Count variable indicating subject’s age (in years).
Marital | Character | 5-level categorical variable indicating the subject’s marital status: single, married, separated, divorced, or widow.
Records | Character | 2-level categorical variable indicating whether the subject has a credit history record: yes or no.
Job | Character | 4-level categorical variable indicating the type of job the subject has: fixed, freelance, partime, or others.
Expenses | Integer | Count variable indicating the amount of expenses (in USD) a subject has.
Income | Integer | Count variable indicating the amount of income (in thousands of USD) a subject earns annually.
Assets | Integer | Count variable indicating the amount of assets (in USD) a subject has.
Debt | Integer | Count variable indicating the amount of debt (in USD) a subject has.
Amount | Integer | Count variable indicating the amount of money (in USD) remaining in a subject’s bank account.
Price | Integer | Count variable indicating the amount of money a subject earns by the end of the month.
Summary of Dataset:
Code
credit %>%
  tbl_summary(by = Status,
              missing_text = "NA") %>%
  add_p() %>%
  add_n() %>%
  add_overall() %>%
  modify_header(label ~ "**Variable**") %>%
  modify_caption("**Summary of Credit Data**") %>%
  bold_labels()
Summary of Credit Data
Variable | N | Overall, N = 4,454¹ | bad, N = 1,254¹ | good, N = 3,200¹ | p-value²
X | 4,454 | 2,228 (1,114, 3,341) | 2,222 (1,142, 3,366) | 2,232 (1,098, 3,326) | 0.3
Seniority | 4,454 | 5 (2, 12) | 2 (1, 6) | 7 (2, 14) | <0.001
Home | 4,448 | | | | <0.001
    ignore | | 20 (0.4%) | 9 (0.7%) | 11 (0.3%) |
    other | | 319 (7.2%) | 146 (12%) | 173 (5.4%) |
    owner | | 2,107 (47%) | 390 (31%) | 1,717 (54%) |
    parents | | 783 (18%) | 233 (19%) | 550 (17%) |
    priv | | 246 (5.5%) | 84 (6.7%) | 162 (5.1%) |
    rent | | 973 (22%) | 388 (31%) | 585 (18%) |
    NA | | 6 | 4 | 2 |
Time | 4,454 | 48 (36, 60) | 48 (36, 60) | 48 (36, 60) | <0.001
Age | 4,454 | 36 (28, 45) | 34 (27, 42) | 36 (28, 46) | <0.001
Marital | 4,453 | | | | <0.001
    divorced | | 38 (0.9%) | 14 (1.1%) | 24 (0.8%) |
    married | | 3,241 (73%) | 829 (66%) | 2,412 (75%) |
    separated | | 130 (2.9%) | 64 (5.1%) | 66 (2.1%) |
    single | | 977 (22%) | 328 (26%) | 649 (20%) |
    widow | | 67 (1.5%) | 19 (1.5%) | 48 (1.5%) |
    NA | | 1 | 0 | 1 |
Records | 4,454 | 773 (17%) | 429 (34%) | 344 (11%) | <0.001
Job | 4,452 | | | | <0.001
    fixed | | 2,805 (63%) | 580 (46%) | 2,225 (70%) |
    freelance | | 1,024 (23%) | 333 (27%) | 691 (22%) |
    others | | 171 (3.8%) | 68 (5.4%) | 103 (3.2%) |
    partime | | 452 (10%) | 271 (22%) | 181 (5.7%) |
    NA | | 2 | 2 | 0 |
Expenses | 4,454 | 51 (35, 72) | 49 (35, 75) | 52 (35, 68) | 0.8
Income | 4,073 | 125 (90, 170) | 100 (74, 148) | 130 (100, 178) | <0.001
    NA | | 381 | 217 | 164 |
Assets | 4,407 | 3,000 (0, 6,000) | 0 (0, 4,000) | 4,000 (0, 7,000) | <0.001
    NA | | 47 | 20 | 27 |
Debt | 4,436 | 0 (0, 0) | 0 (0, 0) | 0 (0, 0) | 0.3
    NA | | 18 | 13 | 5 |
Amount | 4,454 | 1,000 (700, 1,300) | 1,100 (800, 1,415) | 1,000 (700, 1,250) | <0.001
Price | 4,454 | 1,400 (1,117, 1,692) | 1,423 (1,062, 1,728) | 1,400 (1,134, 1,678) | >0.9
¹ Median (IQR); n (%)
² Wilcoxon rank sum test; Pearson's Chi-squared test
Evaluate Dataset

First, we evaluate the dataset for missing values. As indicated in the table, the data does contain NA/missing values. We can create a table that shows each variable and how many missing values it has:

Code
# Shows which variables have missing values and how many
colSums(is.na(credit))
        X    Status Seniority      Home      Time       Age   Marital   Records 
        0         0         0         6         0         0         1         0 
      Job  Expenses    Income    Assets      Debt    Amount     Price 
        2         0       381        47        18         0         0 

We must now decide how to handle the missing values. To see what would be lost by deleting them, we create a new dataset, called new_credit, that omits every row containing a missing value. The original dataset is preserved so that we can later apply the method we choose to address the missing values. We can then count the rows to determine how many observations were deleted in total.

Code
# Creates a new dataset excluding missing values 
new_credit = na.omit(credit)

# Number of rows of new dataset
nrow(new_credit)
[1] 4039

We started out with 4,454 rows and our new dataset has 4,039, so 415 rows were deleted due to the missing data. Running a regression on the complete cases only would throw away 9.3% of our data because of missingness. Instead, we can use multiple imputation to impute the missing values so that we don’t have to discard such valuable information.
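The arithmetic behind these figures, together with the rule of thumb cited earlier (at least as many imputations as the percentage of missing data), can be checked with a few lines of R. The row counts are taken from the output above; rounding the percentage up to a whole number of imputations is an assumption of this sketch.

```r
n_total    <- 4454                       # rows in the original dataset
n_complete <- 4039                       # rows left after na.omit()

n_removed   <- n_total - n_complete      # rows lost to listwise deletion
pct_removed <- 100 * n_removed / n_total # percentage of rows lost

n_removed                                # 415
round(pct_removed, 1)                    # 9.3

# Rule of thumb: number of imputations at least equal to the percent missing
m_suggested <- ceiling(pct_removed)
m_suggested                              # 10
```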

MICE in R

Using the mice (Multivariate Imputation by Chained Equations) package in R, a statistical programming language, we will create multiple datasets with imputed values for the missing values. Because just under 10% of the rows in our dataset contain missing data, we will generate 10 imputations, or 10 new datasets. The mice package does this by generating plausible values based on the other columns and placing them into the cells (intersections of rows and columns) with missing data.

The first step is to check the missingness by looking for patterns in the original dataset using the md.pattern() function:

Code
# drop the first column (X), a row counter that should not inform the imputation
credit <- credit[-c(1)]
md.pattern(credit, rotate.names = TRUE)

     Status Seniority Time Age Records Expenses Amount Price Marital Job Home
4039      1         1    1   1       1        1      1     1       1   1    1
366       1         1    1   1       1        1      1     1       1   1    1
22        1         1    1   1       1        1      1     1       1   1    1
7         1         1    1   1       1        1      1     1       1   1    1
8         1         1    1   1       1        1      1     1       1   1    1
4         1         1    1   1       1        1      1     1       1   1    1
3         1         1    1   1       1        1      1     1       1   1    0
2         1         1    1   1       1        1      1     1       1   1    0
1         1         1    1   1       1        1      1     1       1   0    1
1         1         1    1   1       1        1      1     1       1   0    0
1         1         1    1   1       1        1      1     1       0   1    1
          0         0    0   0       0        0      0     0       1   2    6
     Debt Assets Income    
4039    1      1      1   0
366     1      1      0   1
22      1      0      1   1
7       1      0      0   2
8       0      0      1   2
4       0      0      0   3
3       0      0      1   3
2       0      0      0   4
1       1      1      0   2
1       0      0      0   5
1       1      1      1   1
       18     47    381 455

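In the plot produced by md.pattern(), blue marks observed values and red marks missing values; in the printed matrix above, 1 is observed and 0 is missing. There are 11 patterns.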

In order to perform multiple imputation on categorical data, all string variables must be converted to factors using the as.factor() function:

Code
credit$Status = as.factor(credit$Status)
credit$Home = as.factor(credit$Home)
credit$Marital = as.factor(credit$Marital)
credit$Records = as.factor(credit$Records)
credit$Job = as.factor(credit$Job)
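When a dataset has many string columns, the same conversion can be written once for all of them. The sketch below uses a small made-up data frame named toy (not the credit data) to show the idea.

```r
# Hypothetical example data with two character columns and one integer column
toy <- data.frame(Status = c("good", "bad", "good"),
                  Home   = c("rent", NA, "owner"),
                  Age    = c(30L, 45L, 52L),
                  stringsAsFactors = FALSE)

# Identify the character columns and convert only those to factors
is_text <- vapply(toy, is.character, logical(1))
toy[is_text] <- lapply(toy[is_text], as.factor)

str(toy)
```

Missing values remain NA after the conversion, so the columns are still flagged as incomplete when they are passed on for imputation.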

Using the mice() function, 10 multiple imputations for the missing values will be generated. The default is 5, so m must be set to the number of imputations that you desire; maxit = 10 sets the number of iterations for each imputation. Since the variables in the dataset are both numerical and categorical (with 2 or more levels), the defaultMethod argument will contain pmm: predictive mean matching (numeric data); logreg: logistic regression imputation (binary data, factor with 2 levels); polyreg: polytomous regression imputation for unordered categorical data (factor > 2 levels); polr: proportional odds model (ordered factor, > 2 levels). The seed argument will be given the value 1337 (any number can be used here) so that the same results are retrieved each time the multiple imputation is performed.

Code
# seed fixes the random number generator so that the imputations are reproducible
Multiple_Imputation = mice(data = credit, maxit = 10, m = 10, defaultMethod = c("pmm", "logreg", "polyreg", "polr"), seed = 1337)

 iter imp variable
  1   1  Home  Marital  Job  Income  Assets  Debt
  1   2  Home  Marital  Job  Income  Assets  Debt
  1   3  Home  Marital  Job  Income  Assets  Debt
  1   4  Home  Marital  Job  Income  Assets  Debt
  1   5  Home  Marital  Job  Income  Assets  Debt
  1   6  Home  Marital  Job  Income  Assets  Debt
  1   7  Home  Marital  Job  Income  Assets  Debt
  1   8  Home  Marital  Job  Income  Assets  Debt
  1   9  Home  Marital  Job  Income  Assets  Debt
  1   10  Home  Marital  Job  Income  Assets  Debt
  2   1  Home  Marital  Job  Income  Assets  Debt
  2   2  Home  Marital  Job  Income  Assets  Debt
  2   3  Home  Marital  Job  Income  Assets  Debt
  2   4  Home  Marital  Job  Income  Assets  Debt
  2   5  Home  Marital  Job  Income  Assets  Debt
  2   6  Home  Marital  Job  Income  Assets  Debt
  2   7  Home  Marital  Job  Income  Assets  Debt
  2   8  Home  Marital  Job  Income  Assets  Debt
  2   9  Home  Marital  Job  Income  Assets  Debt
  2   10  Home  Marital  Job  Income  Assets  Debt
  3   1  Home  Marital  Job  Income  Assets  Debt
  3   2  Home  Marital  Job  Income  Assets  Debt
  3   3  Home  Marital  Job  Income  Assets  Debt
  3   4  Home  Marital  Job  Income  Assets  Debt
  3   5  Home  Marital  Job  Income  Assets  Debt
  3   6  Home  Marital  Job  Income  Assets  Debt
  3   7  Home  Marital  Job  Income  Assets  Debt
  3   8  Home  Marital  Job  Income  Assets  Debt
  3   9  Home  Marital  Job  Income  Assets  Debt
  3   10  Home  Marital  Job  Income  Assets  Debt
  4   1  Home  Marital  Job  Income  Assets  Debt
  4   2  Home  Marital  Job  Income  Assets  Debt
  4   3  Home  Marital  Job  Income  Assets  Debt
  4   4  Home  Marital  Job  Income  Assets  Debt
  4   5  Home  Marital  Job  Income  Assets  Debt
  4   6  Home  Marital  Job  Income  Assets  Debt
  4   7  Home  Marital  Job  Income  Assets  Debt
  4   8  Home  Marital  Job  Income  Assets  Debt
  4   9  Home  Marital  Job  Income  Assets  Debt
  4   10  Home  Marital  Job  Income  Assets  Debt
  5   1  Home  Marital  Job  Income  Assets  Debt
  5   2  Home  Marital  Job  Income  Assets  Debt
  5   3  Home  Marital  Job  Income  Assets  Debt
  5   4  Home  Marital  Job  Income  Assets  Debt
  5   5  Home  Marital  Job  Income  Assets  Debt
  5   6  Home  Marital  Job  Income  Assets  Debt
  5   7  Home  Marital  Job  Income  Assets  Debt
  5   8  Home  Marital  Job  Income  Assets  Debt
  5   9  Home  Marital  Job  Income  Assets  Debt
  5   10  Home  Marital  Job  Income  Assets  Debt
  6   1  Home  Marital  Job  Income  Assets  Debt
  6   2  Home  Marital  Job  Income  Assets  Debt
  6   3  Home  Marital  Job  Income  Assets  Debt
  6   4  Home  Marital  Job  Income  Assets  Debt
  6   5  Home  Marital  Job  Income  Assets  Debt
  6   6  Home  Marital  Job  Income  Assets  Debt
  6   7  Home  Marital  Job  Income  Assets  Debt
  6   8  Home  Marital  Job  Income  Assets  Debt
  6   9  Home  Marital  Job  Income  Assets  Debt
  6   10  Home  Marital  Job  Income  Assets  Debt
  7   1  Home  Marital  Job  Income  Assets  Debt
  7   2  Home  Marital  Job  Income  Assets  Debt
  7   3  Home  Marital  Job  Income  Assets  Debt
  7   4  Home  Marital  Job  Income  Assets  Debt
  7   5  Home  Marital  Job  Income  Assets  Debt
  7   6  Home  Marital  Job  Income  Assets  Debt
  7   7  Home  Marital  Job  Income  Assets  Debt
  7   8  Home  Marital  Job  Income  Assets  Debt
  7   9  Home  Marital  Job  Income  Assets  Debt
  7   10  Home  Marital  Job  Income  Assets  Debt
  8   1  Home  Marital  Job  Income  Assets  Debt
  8   2  Home  Marital  Job  Income  Assets  Debt
  8   3  Home  Marital  Job  Income  Assets  Debt
  8   4  Home  Marital  Job  Income  Assets  Debt
  8   5  Home  Marital  Job  Income  Assets  Debt
  8   6  Home  Marital  Job  Income  Assets  Debt
  8   7  Home  Marital  Job  Income  Assets  Debt
  8   8  Home  Marital  Job  Income  Assets  Debt
  8   9  Home  Marital  Job  Income  Assets  Debt
  8   10  Home  Marital  Job  Income  Assets  Debt
  9   1  Home  Marital  Job  Income  Assets  Debt
  9   2  Home  Marital  Job  Income  Assets  Debt
  9   3  Home  Marital  Job  Income  Assets  Debt
  9   4  Home  Marital  Job  Income  Assets  Debt
  9   5  Home  Marital  Job  Income  Assets  Debt
  9   6  Home  Marital  Job  Income  Assets  Debt
  9   7  Home  Marital  Job  Income  Assets  Debt
  9   8  Home  Marital  Job  Income  Assets  Debt
  9   9  Home  Marital  Job  Income  Assets  Debt
  9   10  Home  Marital  Job  Income  Assets  Debt
  10   1  Home  Marital  Job  Income  Assets  Debt
  10   2  Home  Marital  Job  Income  Assets  Debt
  10   3  Home  Marital  Job  Income  Assets  Debt
  10   4  Home  Marital  Job  Income  Assets  Debt
  10   5  Home  Marital  Job  Income  Assets  Debt
  10   6  Home  Marital  Job  Income  Assets  Debt
  10   7  Home  Marital  Job  Income  Assets  Debt
  10   8  Home  Marital  Job  Income  Assets  Debt
  10   9  Home  Marital  Job  Income  Assets  Debt
  10   10  Home  Marital  Job  Income  Assets  Debt
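As a small, hedged illustration of reproducibility, the sketch below runs mice() twice with the same seed argument on nhanes, a small example dataset shipped with the mice package, and compares the imputed values. The choice of nhanes, m = 3 and maxit = 5 is an assumption made to keep the example fast; printFlag = FALSE suppresses the iteration log.

```r
library(mice)

# Two runs with identical settings and the same seed
imp_a <- mice(nhanes, m = 3, maxit = 5, seed = 1337, printFlag = FALSE)
imp_b <- mice(nhanes, m = 3, maxit = 5, seed = 1337, printFlag = FALSE)

# With the same seed, the imputed values of both runs should coincide
identical(imp_a$imp, imp_b$imp)

# complete() returns one completed dataset (here the first of the three)
completed_1 <- complete(imp_a, 1)
anyNA(completed_1)
```

The object returned by mice() stores the imputations separately from the data, which is why complete() is needed to obtain a filled-in data frame.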

The following R code will show the imputed values. Columns are imputations, rows are observations.

Code
Multiple_Imputation$imp
$Status
 [1] 1  2  3  4  5  6  7  8  9  10
<0 rows> (or 0-length row.names)

$Seniority
 [1] 1  2  3  4  5  6  7  8  9  10
<0 rows> (or 0-length row.names)

$Home
           1     2     3       4       5       6       7       8       9
30   parents owner  priv    priv    priv    rent   owner   other    rent
240    owner owner owner   owner   owner   other    priv   owner parents
1060 parents other owner parents parents parents parents parents parents
1677   other owner owner parents   owner   owner   owner   owner   owner
2389    rent owner  rent   owner parents   other    rent    rent   other
2996   owner  priv  rent   owner   owner   owner   owner   owner   owner
          10
30      rent
240    owner
1060 parents
1677    priv
2389   owner
2996    priv

$Time
 [1] 1  2  3  4  5  6  7  8  9  10
<0 rows> (or 0-length row.names)

$Age
 [1] 1  2  3  4  5  6  7  8  9  10
<0 rows> (or 0-length row.names)

$Marital
           1        2     3       4       5       6       7       8     9
3319 married divorced widow married married married married married widow
          10
3319 married

$Records
 [1] 1  2  3  4  5  6  7  8  9  10
<0 rows> (or 0-length row.names)

$Job
          1         2     3         4       5       6       7         8
30   others freelance fixed freelance   fixed   fixed partime     fixed
912 partime   partime fixed     fixed partime partime partime freelance
            9    10
30  freelance fixed
912   partime fixed

$Expenses
 [1] 1  2  3  4  5  6  7  8  9  10
<0 rows> (or 0-length row.names)

$Income
       1   2   3   4   5   6   7   8   9  10
30   105 219  90  27 152  95  85 140  62  80
114  105  80  70 109  70  32  87 110 120  16
144  144  80 210 250 115 184 330 959  80 320
153  113  80 131 118  50  69  67  57 124 140
158  220 129 159 139 196 115 205 140 140 132
177  250 254 245 300 250 300  94 250 178 254
195  187 156  95 220 100 104  90 100 250 158
206  200  75 181 130 150 100 110 179 224 250
241  160 155 180 250 166  68 110 177 150  68
242   80 129 189  50 196 177 126 135 188 172
278   80 105 100 153 153 100 143 140 184 197
318  101  80  95 105  79 228  85 130  41  92
330   95 120 140 150  95 240 100 135 122 230
333  223 210 190 136  98 110 167  35 190 193
335  132 163 132 200 126  96 140 117 168 144
356  163 125  70 100 142 125  92 114 141 117
360   70 133  80 107  50  80 105 132 234  59
394  150 350 350 500 500 500 491 150 500 500
404   25 105 200  55 115 108 144 123 110 147
422  165 242  46 123 205 120  98 120 160 110
439  120 212 175 119 223  95 110 110 120 180
444   67 167  70 100 150  95  93 250  75 107
462  139 168 147 243 220 128 230 150  77 100
469  120 210 144  78 142 154 110 150 366  95
479  300  60 102 112 100 227  86  90 135 250
481  200 158 113 132 100 170 105  50 205 200
483   72 233 165 130  77 184 205  84 120 100
485  125 200 175 147 144  63 154 366 240 147
496  100  80 187 113 102  82  90  60 100 150
498   72 101 208 245 100 170 158 125  85 150
505   75  81 171 125 222 107  57 134  76 103
567  113 113 172 104 176 158  64 179 160 126
572   91  82  65  57  80  67 140 124 181 121
582   70 182  35  49  67 182  65  19  56  56
648  250 250 191 125 210  60 260  50 500 288
653   62 161 180 260  92  97 122 130 110 195
667  416 416  91 245 241 250 142 245 250 230
675  107 230 250 250  88  53 208 166  42 250
678  126 115 120 185 225 107  60 140  73 140
699  184 137  67  60 176 130 188 100 172 120
708  340 124  72 235 107  80  52 104 179  71
714   90 105  48  85  99 142  75 132 283 100
716   80  93  42  80 103  38  70  52 114  86
733  136 124 124 366 168 119  70 123 104 150
734   92 178   8  60 130 160 155 133 125 107
746   85  63 135 110 185  98 140 140 118 110
777  104  70  75  75 192  65 136 150  70 209
781  134 100 140 149 122 117 155  87 143 160
785  150 120 170  65 200 165 314 464 150 464
804  100 211  71 122 121 135 300 110 210 145
824   86 113 175 175 150  55 138 105 182  50
865  120 123  80  86 148  90 100 149 127 204
866  100  92 100 100 128 107  82 100 157  62
880  315 102 154  50 129 108 104 107 140 212
889  428 442 172 190 111 150  41 350 200 156
906  210 350 225 214 100 382  70 235 122 394
912  120  53 120  80 150 158 135 113  95 128
942   69  65  75  38  70  79  82 140 110 106
952  139 215 115 144 225 188 140 125 121 130
989   78 100  80 103 147  76  71  74  50   8
1001  40 140  77  53  70  63  92  66  63  59
1017 145  70 233 135 117  68 113  63 185 150
1039 213 191 233 200 120  72 107 150 167 234
1044 140 149  75  75  90  78  72  80  90  80
1069 155 134 138 250 176  92 190 178 110 333
1100  62 102  40 125  94  22 136  87  77 107
1111 120 115 115 121  94 127  65 173  82  48
1125  40 315 166 178 184 139 300 100 153 306
1168 470 145 157  92 230 205 100 175 202 187
1208 245  60 150 160  96 117 158 220 200 214
1226 161 160 250  99 250 161 120 176  61 250
1250  67 200  86 102 201 150 117 100 160  64
1257  58 246 390 150 333 158 152 130 154  49
1276 426 125 210 182 200 117 178  85  45  90
1281 105 120 107 160 180 216  80  80  56  75
1289  72  75  63  65 162  75  90 103 103  38
1297 190 165  70 136 130 115  91 109  85 112
1307  96 245 110 110  96  90 121 195 177 172
1314 218  84 100 159 200 290 218 137 127 324
1335 185 117 180 100 117  70 178 134 213 146
1364 154 240 183 158 289  60 228 214  79 857
1365 185 145 165 170 129 125 120 250  53 212
1366  75 250 154  90 130 210 132 268 148 148
1392 500 491 491 491 350 500 150 150 500 500
1421 167 100 170 125 167 204 186 145  70 173
1427  90 230 100 270 196 200  90 156 143 100
1433 217 132 222  63  99   8 100 105  68  66
1436  81 100 105  85 176 113 147 159 156  87
1437  85  60 240  83 163 125  80 125 140 215
1441  67 134 104  94  86  60 144 110 139  65
1456 120  41  42 105  65 107  57  91  60  58
1473 110 120 235 200 118 107 413 165 113 133
1509  80 156  90 118 140 141  46  17  86 135
1513  56 110  87  75  50  65  75 135 150 120
1530 117  66 140 120  80 149 166 114  80  80
1535 200 108 178 135  57 138 100 184 112 107
1536 152 257 300 250 373 400 231 275 532 230
1544 100 107 111 110 100 109 160 105 195 166
1549 125 150 120  76 171 122  96 110 120 189
1564  97 122  95 100 110  25 180 164  56 120
1580 135 130 114  54  87 110 178 170  86  81
1583 131 110 116 450  87  88 150  42  78 195
1598 260 146 240 107  49  35 216  90 190 195
1599  64 106 100  80 165 109 115 220  82 140
1619 152 125 112 100  80  72 145 139 232 159
1629 364 200  52  67 142 160 125 101 236  75
1648 138 135 130 116 100  26  63  75  70  48
1662  95 236  80 245  95 104 121 200 107 135
1677  90 120  71 115 120 120 123 208 340  93
1685 250  92  76 145 125  85 180 104 210 103
1722 180 100 150  81 200  55 207 154  81 225
1724 120  37 161 200 118 118 112 184 108 111
1733 220 214 230 275 250 170 318 310 130 213
1741  92  77  55  86  95  63  70  60  49 105
1745 120 150 149 260 270 131 230 203 172 146
1753  60 136  50  72  82  60  50 136 126 135
1762 150 139 108 107  70 260  76  74 111 105
1766  93 200 128 107 857 150 275 191 315 104
1771 170 106 214  60  93 230 186 219  20 223
1798 108 100 112 118 263 231 180  80 143  70
1802 491 150 150 500 150 150 500 500 491 905
1803  69  89 131  80 144 225 120 150  90 160
1807 137 115  86  53 100 190 120  68  85  70
1811  69  90 146 100 199  75 143 221 100 150
1844  85 199 256 169 100 113 178 245 130  46
1851 700 250 275 178 350 250 120 125 230 260
1852 100 107 117 100  67 135 120  50  85  90
1870 330 150 352  60 177 205 177 202 251  71
1872  75 101  90  90 185 118 117 131  85  89
1882 136  60  72  77  48 100 120  50  92  71
1883 150  54 300 106 140 115 117 250 340  66
1893 150 350 150 905 200 491 491 500 491 150
1898  80  86  80 117 195  60  86  60 225 150
1903  47 139 100 202  60  57  60  57 120  92
1907 250 535 275  93 274 100 118 214 314 120
1920  60 101  93  74  82  35 100 116 135 123
1936 160 201 100 125 165 207 200 341 200 125
1946 137 101  55 100 236 112 120 109  90  79
1948  53 101  65 130  75  85  64  85  85  71
1962 115 114 179 104 318 148  55 100 110 120
1963 341 220 260 182 318 130 219 138 250 188
1965  86  80  91  60 176 300  80 130 200  80
1970 150 100 500 230 150 283 250 250 100 189
1972 500 500 491 491 500 150 500 500 905 200
1977 242 276 293 198 180 190 157 100 137 247
1979 100 109 186  90 100 105 180  95  85 107
1980 100  80 182  50 119  70 128  72  70  58
1984 274 143 218  41  60 195 186 145 200 240
2006 100 195 143  42 145 103  68  52 154 241
2016 260 230 100 140 180 199 250 114 280 214
2022  86 142  93  85  62 184 139 150 150 110
2025 266 251 191 148 313 220  68 245 289 300
2042 102 178  60  50 200 160 127 142 103 128
2043  16  96 140  95 140  81  65 190  80 163
2076  90 122 101  73 155 130  64 140  95  80
2077 152 169  38 160 240 139 123 200 187  52
2083  92  71 121 166 202 162 189  42 240  85
2156 150  62 202 260  77 300  85 198 200  96
2157 218 125  70  75  87  90  19  35 151 130
2186 110  60 105 200 140 234  92 140  92 125
2197 100 140 175 169 165  82 150 238 150 112
2205 100  42 150 428 315 210 115 150 380 180
2218 100 250 187  65 100 120  59 192 100 134
2227  50  30  55  46  51  72  50  70  63 138
2233  55  91 143 120  60 120  70  92 149 250
2240 341 174  66  78 189 100 209 274 120 208
2257 106  76 184 118 120 220  88 162  88 115
2280 275 260 178 137 180 133 321 531 257 180
2291 293 146 101 300 224 140 190 158 113 384
2297  91  70  84 119 135 115  81 150 128 138
2304  53  52  85  42  56  70  32 135  31  33
2310 150 111 315 150 360 532  74 230 500 106
2323  70  26  76 135  45 130  56  58  65 121
2331 500 500 241 905 500 150 241 905 241 491
2337  51  80  27  42 152  80  73 114 113  75
2349 120 500 303 120 120 300 100 142 186 165
2365 700 155 100 158 200 160 200 390 170 212
2369 500 140 147 199 230 178 230 225 190 121
2387 135 219  66  75 116 150 120  90 139  72
2396  80  63 105  68 120 125  98  74 440 140
2399 150 200 117 121 500  81 223 138 466 155
2402 265 200 199 700 200 191 183 150  79 300
2404 139 157 110 150 300  63 106 165 150 200
2437 100 151 151 250 350 115 257 137 315 275
2445 170 121  25 128 181 115 200  75 199  93
2446  76 140 145 144 123 129 220 187 146 120
2453 115 109 178  75 110 170 115 113  78  81
2460 110  75  78 139  95  60  92  45  45  90
2467  40  63  37 109 130  59  50 120  72  73
2473  41  70  65 129 116 128 150  82 110  98
2490  85  96  82  42  85 182  52  88  46 152
2495  80  84 125 130  63  78  99  48  85  63
2505 318 107  93 268 260 200 183 340 273 130
2566 148  93  92 120 129 116 142 221 327  70
2572 150 128 140 148 137 112 125  80 160 143
2578 176 112  62 116  90  87 160 164  83 319
2584  69  75 110  48  42 130 100  79 150 100
2596  67  78  60  83  92 120  60  80 166  72
2605 100  90 100 126 123  82  73  55  75  85
2614 110 114 191 126 206  60  85 125 131 180
2624  75 100 210 104 180 118  41  61  39 120
2625 155 162 165  61 155  43 177 160 107 150
2631  91 175 110  98  76  24 125 212 115 208
2632 137  90  68  63 126  60 140 103  88 137
2651  53 104  83 102 130 100 147 140 185 150
2652 148  94 100 104  79 108  70 150 120 170
2653 250  90  60  88 105  92 130  81  90 125
2668 213 400  50  85  64 177 250  90 170 144
2676 115 107  89  72 160 132  61 121 170 260
2681 163 110  77  99 100 100 250 128 100 220
2683  80 106 103 142 160  85 120  90  60  71
2695 141  90  80 169 160  90 107 158 152  47
2696  92  67 127  83  73 156  47 105  92  87
2707 200 119 142  90  81 157 104 140  90 137
2720 210 211 210 164 250 100  53 110 478  70
2723 100 147  57  45  87  83  60  56 123  63
2725 148  50 200 857 165  65 175 208 183 315
2730 155  92 178  67 185 159 180  90 120 133
2769  70  90  70  58  85 105 251 129  89  45
2780  40 110  45  32  85 109  85  66 119 109
2781 106 183 186 173 207 133  60 150 390 230
2802  90 161 130  90  81 154 168 167 226 150
2805 100 130 162  81 182 156 300 100  77 142
2806 166 150  92 148  99  50 131  99 131  57
2807  60  50  90 120 110 105 270 105  93 182
2810 113  85 136 110 150 200  65  92  97 110
2813 300 160 143  66 167  40 123 265 123 215
2815  60  80 166 225  82 283 125  98  60  93
2825 130 148 223 140 157 260 150 310 155 200
2854 171 140  42 466  65 228 150 300 266 212
2869  42 213  90 122 138  69 177  99 121 145
2882 100  70  95  65 100  60  95  83 335  78
2884 130  93 125  80  80 190  57 136 283 113
2893 160 100 135  67  65 142 190 163 211 181
2915  81 187 169 140  64 300  90 210 153 125
2927  21  62  72 120  59  80 112 125 149  27
2935  88  86  50 134 120 195  90 102  73  90
2936 200 174 158  86 200 128 205 424 170 150
2939 115 130 120 173 122 137  85 100  53 115
2951 500 183 200 245 500 178 416 416  91 178
2954 300 250 200 700 120 125 260 144 230 959
2969 171 150  95 105 189 117 211 116 315 197
2971 300 124 211 230 100 123 164 185 173 173
2979  94  94 240 171 300 531 300 142 120 300
2983  57 197 110  99 116  85  76 250 128  80
2991 101  95  60  70  66  51  80  72  76  65
2996 120 157  66  82  67 104 118 268 160  69
2999 114 305 300 700 174  50 220 129 177 236
3008 300  60 288 125 125  60 250 137 275 459
3014 120 135 160 115  35 200 175  88 160 210
3021 233  42 130 192 110 100 102 198 210 270
3026  90 140  32  75 200 164 100 100  81 110
3031  90  74  73 175  33 182  81  60  60 130
3038  67  53  42  53  67 121 121  52  33 121
3040 340 211  42 298 210 320 345 128 158  92
3069 436 165 142  95 200 260 198  63 268 109
3080 120 106 117  92 121 150  92 140 101  83
3096 253  70 126 170  84 110 150 178  81 160
3104  76  64   8 150  75 168 165 125 130 129
3106 160 107 204 216 120 131 181 178 141 110
3110  74 190 191 150 266 111 371  41 166 183
3121 157  90 200 359 464 383 222  80 292  67
3123  72 130 130 188 210 231 167 187 243 171
3139 400 250 150 230  90 200 373 459 321 120
3167  55  85 162 110  57 210 128 120  52  80
3170 218 176 140 135 128 155  67 231 133 124
3183  67 180  80 120 100 145  43 179 114 148
3185 419 217 110  95 126  75 100 124 112  75
3187 156 125  78  63 100  97  85 159 113 245
3203 120  59  88 164  67  87  59 117 100  63
3218  78 114  70  52  75  70 140  80  55 105
3222  99 201 230 187 120 180 120 108  90  66
3229 150 112  77 135 139  62 100  73 132  45
3233 464  53  53 212 247 300 103 230 191 176
3237 123 210 145 135 103 158 215 142 256  95
3245 139 150  97 100 130  55  75 140 120  95
3252 125 107 140  47  95 280  80  88 172 116
3266 187 197 206 160 352 244 105  95 225 215
3286 136 100  70  65 129  89 160  60 225  96
3288  75 132 150 290 208 250  82 275  42 200
3304 200 491 178 491 150 183 183 200 178 491
3310 142 210 156  75 134 180 125  55 250  80
3316 177 197 237 102 159 121 101 124 196 124
3325  16  96 170 140 151 120 176 107  73 124
3336 190  39 100 179 114 120 135 112 110 123
3338 245 200 416 183 245 245 183 241 230 200
3345  83  80  39 211 195  78 225  35 214 400
3352 168 318 156 235 107  80  63 242 155 173
3365 140 297 219  70 156 131 105 149 165  38
3382 176 130 118 101 110  96 193  62  63 137
3433  80  66  92  75 140  75  93  77 130 102
3439 209 110  82  65 116  65  75  52 106 209
3451  85 139 105  92  42 139  70 136  77  50
3452  65  62  92  66 123  82  43  75  80  85
3454 109 233 173 120  63 131 107 300 145  90
3456  75  95 140  90 158 250  67  96  60 130
3461 133  90 154 240 140 120 132 121 101 300
3462 114 140 120 257 115  72 110  93  86 100
3473  78 160 120  65 185 110  80  87  95 160
3477  78 125 145 115  89 100 214  90 121 140
3478 300  90 152  70 165  93 123 150 145 140
3494  80 122  47  92  62 125 227 220 140  90
3513 129 113 180 160 100  88 225 172 100 145
3523 150  90 160 130 158  97 315  40  60  95
3525  78 109  83 116  95 169 100 168 160 150
3534  47 152 190 134 100 296 150 126 210 110
3556 203  61 186 205 246 218 200 125 214 146
3641 150  35 165 189 120 110 320 107 320 150
3645 122 232 110 125 113 105  95  72  90 115
3657 134 120  75 100 166 126  95 210  90  33
3674  42 175 120  90 120  48 151 173  53  63
3679 300 190 113 183 106 110 114 100 114 150
3691 178 122 203 170  70 139 135 144 120 175
3704 247 105  81 190  77 100 172 150 140 110
3709 300 150 235  72 240 143 130 475  50 116
3714  75 110 115 209  80 136 111  60  60 140
3717 190 110 114 140 130  67 105 120  63  90
3730 150 188  76 174 285 140 100 165 174  91
3740 141 350  80  90 156 135 160 140 135 106
3763  95  48 180 116  25 106  78 210  85 105
3768 115 131 201  96 145 100 247 192 324  81
3773 175 100 459 400 715 300 400 459 156 230
3794 142 130  72 179  82 106 112 102 102 217
3800  60  56 105  55  80 166  71  99  70  82
3823  16  83 176 156 140  78  66 141 133  50
3825  42  33 121 121  53 121 121  49  49  33
3850 130 120  80  65 182  88 100  46 148 200
3855  53  60 125 250 250 173  90 124 126 135
3857  70  85 102  70  59  82 139  88 100 108
3858 105  70 110 130  80  75 139  63 110  77
3882  65 124 108 170 164 100  36  86  70 260
3887 145 157 200 168 139 240 230 130 250 134
3892  20 111  89 183 154 100 110 174 146 211
3902 116 234  80 105 205 175 132 164 240 138
3914  83 190 115  73 173 129  58 166 134 150
3928 160 500 430  60 137 350 300 144 325 178
3932  21 158 117 120 283 140  78 100 125 141
3945 152 270 128 145  66  81  77 140 350  65
3946 175 144 428  70 214  50 642  65  70  79
3947  23  60  56 120  35  85  67  55  50  88
3951 120  17 177 193  70 280 145  95 107 135
3955 100 150 100  66  52  60  70  40  62  60
3966 105 166  80 163  81  98 106 131 140  42
3992 185  58  71 100  40  85 100 120  90 200
4003 208  80 139 104 134 136 225 130 164 115
4023 340 223 105 105  57 207 220 250 186 270
4036  60 219 106  90  65  92  85  62 175 139
4049  67  49 121 121  53  42  67  88  49  49
4064  50 140  78 104 210 123  98 125  60 148
4069 340 276 107 135 470 147  20 190 117 220
4076  31  56  80  88  49  63 165  19  37 121
4082 200 211 108  74  60 470  80 100 225 110
4085 400 300 382 384 350 170  90 229 165 101
4096  54 136 200 193 135 110 244  25  85 107
4119 125 120  82 102 135 190  74  85 126 135
4159 102  72 110  34 141  17 146  40 126 127
4168  52 117  66 136 100 103  70  72 132 100
4173  47 105  92  71  75 135  90  70 130 139
4181  75  70 140  53 125  98  65  95  92 149
4191 310 196 131 120 142  85 141 375 130 120
4198 189 138 111 959 151 204 176 330 442 250
4199 200 110 154  70 130 179 192  82 100  90
4222  79 120  79  72 180 200  60  78 150  92
4223  66 110 116  70 166  88  92  51 200 131
4237 185 117 114 170 113  92 100 140 132 125
4246 182 182 169 165 125  97 100  92 265 142
4247  65 100  25  60 150  70 129  60 105 150
4256 250 150 298 191 128 148 157 250 150 122
4281  99 125  74  38 139  83 150  60 105  19
4295 100  87 124  96  60  80  96  96  65  51
4333  90 130  60  52  75  47 110  55  82 147
4349 160 104 155 130  70 125  95  80 155 210
4368 125 119  92 128 118 140 110  36  58  46
4373 175 121 123 102 122 100  87 247  90 170
4398  85 155 100 115 169 111 115  67 126  63
4411 250 160 140 356 133 212 700  93 180 113
4420 150 491 905 905 241 905 491 905 905 200
4433 120  60 154 131  65 175  80 210 185 180
4436  92  80 125  52 144 110 145 140  78  70
4440 101 178 134 190  98 217 120 138 154 172
4441 260 200 248 162  85 205 162 168 107 246

$Assets
         1      2     3     4     5     6      7     8      9     10
30   15000  10000  1500  4000     0     0     18     0   2500      0
240   4500  32000 19400  6000 60000  4000   8000  8000   4500  39000
735   7000   7500 15000  5500 15000 10500  14000 18000  20000  80000
1060  3000      0  4000  4500     0     0    700     0   2000   3500
1129 11500   6000  5000  8500  5000 10000  12000  2500   8000  20000
1670  2500      0  3500     0  4000     0   3500 14000   4000   4000
1677     0   6000 10000 10000  8500  8500  40000 25000  13500   8500
1812  1000  25000  6500 10000  3000 30000   6000 60000   8000   6000
1845     0      0     0     0     0     0      0     0      0   1500
1878  2000      0     0     0     0  4000   1000  2500      0      0
1893 26000 150000 26000 80000 30000 20000 150000 50000 150000 200000
2074  3500  30000     0     0 18000  5000   5000  3500   3000   4700
2237     0      0     0     0     0  1800   3000     0    800      0
2291  7750  10000  7000 14000  2500  8000   7000 30000  10000   3000
2368     0      0     0     0     0     0   3000  3500      0      0
2389     0      0     0     0     0 12000      0     0      0   5500
2439     0      0     0     0     0     0      0     0      0      0
2449 13000   8000  8000     0  7000 10000  10000 10000   6000   5000
2473     0   5500  5000  3000  8000  6000   9000  4000   4000   6500
2530  5000   4162  3500  5000  4500  6000   3000  4000  12000   6000
2653  5000   7500     0  7100 10000  8000  10000  7000   2500  78000
2720  7000   7000  4000     0 25000  3500  25000 11000   6000   8000
2772  6000      0     0  3000  3000  2000   4000  2500   8000   6500
2857  5000   4000     0     0  3500     0      0     0   4000      0
2951 30000      0 30000 28000  8000 25000  90000 65000  50000  29500
2996     0  11000     0     0 16000 21500   5500  8000  15000  15000
3053 15000   9000  3500  7500  4500 10000   5000  4000   7000  11000
3183  5000   8000  4300 14000  5000  9000      0  4000   7500   5000
3196  5000   5000  7000 15000  5000 11000   5500 25000  11500  35000
3218  2000   8000  3000  2500  4000  2500   7500     0   4500   6000
3229  9000   9000  3500 10000  7000  6000   2000     0  10000   3000
3330     0      0     0     0  4000     0      0     0      0      0
3440  3500   3500 11500  2000  5000  7000  60000  4000   5000   6000
3549     0      0  4000     0  3000     0   3500  7000   4000      0
3647 15000   9000  6000  8000  3000  6000   7000  5000  12000   5000
3652     0      0     0     0     0     0      0     0      0      0
3661  3000   5000 15000     0     0  5000   4000     0   4000   6000
3821     0  45000     0  6000  2000  4000   4000  3000  13500   7500
4035 25000   5000  5000     0  5000  3000   1600     0      0   3500
4074  6000  30000  3000  6500  8000  2000   4650 20000  10000  15000
4111  8000   7000  4000 15000  3500  3500   2200  6000   8000   3000
4119  6000   8000  8000 11000  5000 30000   3000  9000   2500  10000
4168 24000   5000  3000  8000  9000  6000   7000  4000  20000  15000
4187 14000      0     0     0     0     0      0     0      0      0
4192  3500   3500  4500  4000     0  2800   4000 15000   4000   3500
4288     0      0     0  4500     0  3500   4000     0      0      0
4446  3000   4000     0 10000  6000  6000   6500 17000   6000   4000

$Debt
        1    2   3     4    5    6    7 8    9    10
30      0 2000 500     0    0    0    0 0    0     0
240  1300 1000   0   480    0    0    0 0    0  3000
1060    0    0   0     0    0    0    0 0    0     0
1677  960 1000   0     0    0    0    0 0    0     0
1812    0    0   0     0    0 1408    0 0    0     0
1845    0    0   0     0    0    0    0 0    0     0
1878    0    0 500     0    0    0  600 0  360  1749
1893    0  500   0 15000    0    0    0 0    0 15000
2074    0    0 500     0 3130    0 3000 0    0  3378
2237    0    0   0     0    0    0    0 0    0     0
2389    0    0   0   500    0    0  108 0    0     0
2449    0  500   0     0    0    0    0 0    0     0
2653 2000    0   0     0 1500    0    0 0    0     0
2951 9300    0   0  1300    0    0    0 0  933  3000
2996    0  360   0     0  500    0    0 0 3700     0
3218    0 2000   0     0 1500 2800    0 0    0  3000
4074  600    0   0     0    0    0    0 0    0     0
4288    0    0   0     0    0    0    0 0    0     0

$Amount
 [1] 1  2  3  4  5  6  7  8  9  10
<0 rows> (or 0-length row.names)

$Price
 [1] 1  2  3  4  5  6  7  8  9  10
<0 rows> (or 0-length row.names)

Statistical Modeling

We can check the quality of the imputations with a strip plot, a one-dimensional scatter plot that shows the distribution of a variable within each imputed data set. Plausible imputations are values that could have been observed had the data not been missing.

Code
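# stripplot() on a mids object draws lattice graphics, so par(mfrow = ...) has no effect here; each plot is drawn on its own
stripplot(Multiple_Imputation, Status, pch = 19, xlab = "Imputation number")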

Code
stripplot(Multiple_Imputation, Seniority, pch = 19, xlab = "Imputation number")

Code
stripplot(Multiple_Imputation, Home, pch = 19, xlab = "Imputation number")

Code
stripplot(Multiple_Imputation, Time, pch = 19, xlab = "Imputation number")

Code
stripplot(Multiple_Imputation, Age, pch = 19, xlab = "Imputation number")

Code
stripplot(Multiple_Imputation, Marital, pch = 19, xlab = "Imputation number")

Code
stripplot(Multiple_Imputation, Records, pch = 19, xlab = "Imputation number")

Code
stripplot(Multiple_Imputation, Job, pch = 19, xlab = "Imputation number")

Code
stripplot(Multiple_Imputation, Expenses, pch = 19, xlab = "Imputation number")

Code
stripplot(Multiple_Imputation, Income, pch = 19, xlab = "Imputation number")

Code
stripplot(Multiple_Imputation, Assets, pch = 19, xlab = "Imputation number")

Code
stripplot(Multiple_Imputation, Debt, pch = 19, xlab = "Imputation number")

Code
stripplot(Multiple_Imputation, Amount, pch = 19, xlab = "Imputation number")

Code
stripplot(Multiple_Imputation, Price, pch = 19, xlab = "Imputation number")
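
As a numeric complement to the strip plots, one rough plausibility check is to ask whether the imputed values fall inside the range of the observed values. The sketch below uses base R only and hypothetical toy numbers (`observed` and `imputed` are illustrative and not taken from the data above). A value outside the observed range is not necessarily wrong; it is only a prompt to look more closely.

```r
# Hypothetical toy data: observed values of one variable and
# the values imputed for its missing entries in two imputations
observed <- c(80, 120, 150, 95, 210, 60, 175)
imputed <- list(
  imp1 = c(100, 140, 90),
  imp2 = c(70, 400, 130)  # 400 lies outside the observed range
)

# Range of the observed values
obs_range <- range(observed)

# For each imputation, keep only the values outside the observed range
out_of_range <- lapply(imputed, function(v) {
  v[v < obs_range[1] | v > obs_range[2]]
})

# Number of flagged values per imputation
n_flagged <- vapply(out_of_range, length, integer(1))
print(n_flagged)
```

Imputations with a nonzero count are the ones worth inspecting in the corresponding strip plot.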

Next, we fit the analysis model to each of the imputed data sets and pool the results into a single set of estimates that properly accounts for the uncertainty caused by the missing data. The with() function fits the complete-data model to every imputed data set, and pool() combines the fitted models. The summary of the pooled results gives the estimate, standard error, test statistic, degrees of freedom, and p-value for each model term.

Code
# fit complete-data model
fit <- with(Multiple_Imputation, glm(Status ~ Seniority + Home + Time + Age + Marital + Records + Job + Expenses + Income + Assets + Debt + Amount + Price, family = binomial))

# pool and summarize the results
summary(pool(fit))
               term      estimate    std.error    statistic         df
1       (Intercept)  9.672533e-01 7.341406e-01   1.31753137 4218.92797
2         Seniority  8.302704e-02 7.509116e-03  11.05683361 4026.09993
3         Homeother  6.847181e-02 5.740463e-01   0.11927924 4408.20935
4         Homeowner  1.151566e+00 5.598295e-01   2.05699445 4422.65325
5       Homeparents  9.460827e-01 5.683979e-01   1.66447262 4416.87452
6          Homepriv  4.143875e-01 5.773215e-01   0.71777591 4418.33662
7          Homerent  4.162157e-01 5.632987e-01   0.73888974 4409.85773
8              Time  6.772940e-05 3.486236e-03   0.01942766 4226.07583
9               Age -1.092253e-02 5.000167e-03  -2.18443281 4164.62555
10   Maritalmarried  6.137484e-01 4.234126e-01   1.44952805 2987.91483
11 Maritalseparated -6.762074e-01 4.664549e-01  -1.44967375 3525.81248
12    Maritalsingle  1.581997e-01 4.283363e-01   0.36933532 3172.24218
13     Maritalwidow  1.736481e-01 5.328093e-01   0.32591039 3649.32932
14       Recordsyes -1.785754e+00 1.021443e-01 -17.48265446 4339.76452
15     Jobfreelance -7.612312e-01 1.020269e-01  -7.46107977 4198.03391
16        Jobothers -7.117123e-01 2.047266e-01  -3.47640329 3215.47838
17       Jobpartime -1.472669e+00 1.260182e-01 -11.68615527 4373.09140
18         Expenses -1.521722e-02 2.637894e-03  -5.76870255 3189.84443
19           Income  7.267654e-03 8.237279e-04   8.82288088   86.99121
20           Assets  2.169205e-05 6.920445e-06   3.13448801  158.81379
21             Debt -1.715864e-04 3.765713e-05  -4.55654458  232.74078
22           Amount -1.949321e-03 1.722167e-04 -11.31900125 3937.75396
23            Price  8.748681e-04 1.266726e-04   6.90652888 3894.76779
        p.value
1  1.877321e-01
2  5.120915e-28
3  9.050596e-01
4  3.974528e-02
5  9.608895e-02
6  4.729334e-01
7  4.600133e-01
8  9.845009e-01
9  2.898604e-02
10 1.472951e-01
11 1.472385e-01
12 7.119025e-01
13 7.445108e-01
14 3.435636e-66
15 1.037176e-13
16 5.149181e-04
17 4.329561e-31
18 8.752264e-09
19 1.036059e-13
20 2.051064e-03
21 8.396911e-06
22 2.977718e-29
23 5.775505e-12
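
The pooling performed by pool() follows Rubin's rules: the pooled estimate is the average of the per-imputation estimates, and its variance adds the average within-imputation variance to an inflated between-imputation variance. The sketch below works through that arithmetic in base R for a single coefficient. The estimates and standard errors are hypothetical, m = 5 is used for brevity, and the degrees-of-freedom adjustment that pool() also reports is left out.

```r
# Hypothetical estimates and standard errors of one coefficient
# from m = 5 imputed data sets (not taken from the output above)
est <- c(0.081, 0.085, 0.083, 0.084, 0.082)
se <- c(0.0075, 0.0074, 0.0076, 0.0075, 0.0075)

m <- length(est)

q_bar <- mean(est)            # pooled estimate
w <- mean(se^2)               # within-imputation variance
b <- var(est)                 # between-imputation variance
t_var <- w + (1 + 1 / m) * b  # total variance (Rubin's rules)

pooled_se <- sqrt(t_var)
statistic <- q_bar / pooled_se

print(c(estimate = q_bar, std.error = pooled_se, statistic = statistic))
```

The pooled standard error is never smaller than the average within-imputation standard error, because the between-imputation term carries the extra uncertainty that comes from not knowing the missing values.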

Conclusion

In conclusion, missing data can occur in research for a variety of reasons, and it is never a good idea to ignore it. Ignoring it leads to biased parameter estimates, loss of information, decreased statistical power, and weak reliability of findings (Dong and Peng 2013). The best course of action is to handle the missing data with multiple imputation. When missing data is discovered, first identify it and look for missing data patterns. Next, define the variables in the dataset that are related to the missing values and that will be used for imputation. Then create the necessary number of completed data sets, fit the analysis model to each of them, and finally pool the results into a single set of estimates. Performing these steps will minimize the adverse effects of missing data on the analysis (Pampaka, Hutcheson, and Williams 2016).
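
The first of these steps, identifying the missing values and their patterns, can be sketched in base R as follows. The data frame `df` and its values are hypothetical and serve only as an illustration; the mice package offers md.pattern() for the same purpose.

```r
# Hypothetical toy data frame with missing values
df <- data.frame(
  Income = c(120, NA, 95, 150, NA),
  Assets = c(3000, 5000, NA, 0, NA),
  Age = c(30, 45, 52, 28, 39)
)

# Count and proportion of missing values per variable
n_missing <- colSums(is.na(df))
p_missing <- colMeans(is.na(df))

# Missing data pattern of each row: 1 = observed, 0 = missing,
# in the column order Income, Assets, Age
pattern <- apply(!is.na(df), 1, function(r) {
  paste(as.integer(r), collapse = "")
})

# How often each pattern occurs
pattern_counts <- table(pattern)

print(n_missing)
print(p_missing)
print(pattern_counts)
```

Variables that are often missing together, or whose missingness appears to depend on other observed variables, are the ones to consider when choosing predictors for the imputation model.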

References

Arnab, R. 2017. Survey Sampling Theory and Applications. Academic Press. https://www.sciencedirect.com/topics/mathematics/imputation-method.
Azur, M. J., E. A. Stuart, C. Frangakis, and P. J. Leaf. 2011. “Multiple Imputation by Chained Equations: What Is It and How Does It Work?” Int J Methods Psychiatr Res. 20 (1): 40–49. https://onlinelibrary.wiley.com/doi/epdf/10.1002/mpr.329.
Cao, Y., H. Allore, B. V. Wyk, and R. Gutman. 2021. “Review and Evaluation of Imputation Methods for Multivariate Longitudinal Data with Mixed-Type Incomplete Variables.” Statistics in Medicine 41 (30): 5844–76. https://doi.org/10.1002/sim.9592.
Dong, Y., and C. J. Peng. 2013. “Principled Missing Data Methods for Researchers.” SpringerPlus 2 (222). https://doi.org/10.1186/2193-1801-2-222.
Mainzer, R., M. Moreno-Betancur, C. Nguyen, J. Simpson, J. Carlin, and K. Lee. 2023. “Handling of Missing Data with Multiple Imputation in Observational Studies That Address Causal Questions: Protocol for a Scoping Review.” BMJ Open 13: 1–6. http://dx.doi.org/10.1136/bmjopen-2022-065576.
Pampaka, M., G. Hutcheson, and J. Williams. 2016. “Handling Missing Data: Analysis of a Challenging Data Set Using Multiple Imputation.” International Journal of Research & Method in Education 39 (1): 19–37. https://doi.org/10.1080/1743727X.2014.979146.
Pedersen, A. B., E. M. Mikkelsen, D. Cronin-Fenton, N. R. Kristensen, T. M. Pham, L. Pedersen, and I. Petersen. 2017. “Missing Data and Multiple Imputation in Clinical Epidemiological Research.” Clinical Epidemiology 9: 157–66. https://www.tandfonline.com/doi/full/10.2147/CLEP.S129785.
Schafer, J. L., and J. W. Graham. 2002. “Missing Data: Our View of the State of the Art.” Psychological Methods 7 (2): 147–77. https://psycnet.apa.org/doi/10.1037/1082-989X.7.2.147.
Streiner, D. L. 2008. “Missing Data and the Trouble with LOCF.” EBMH 11 (1): 1–5. http://dx.doi.org/10.1136/ebmh.11.1.3-a.
Thongsri, T., and K. Samart. 2022. “Composite Imputation Method for the Multiple Linear Regression with Missing at Random Data.” International Journal of Mathematics and Computer Science 17 (1): 51–62. http://ijmcs.future-in-tech.net/17.1/R-Samart.pdf.
van Ginkel, J. R., M. Linting, R. C. Rippe, and A. van der Voort. 2020. “Rebutting Existing Misconceptions about Multiple Imputation as a Method for Handling Missing Data.” Journal of Personality Assessment 102 (3): 2812–31. https://doi.org/10.1080/00223891.2018.1530680.
Wongkamthong, C., and O. Akande. 2023. “A Comparative Study of Imputation Methods for Multivariate Ordinal Data.” Journal of Survey Statistics and Methodology 11 (1): 189–212. https://doi.org/10.1093/jssam/smab028.
Wulff, J. N., and L. E. Jeppesen. 2017. “Multiple Imputation by Chained Equations in Praxis: Guidelines and Review.” Electronic Journal of Business Research Methods 15 (1): 41–56. https://vbn.aau.dk/ws/files/257318283/ejbrm_volume15_issue1_article450.pdf.